Dataset statistics
| Number of variables | 24 |
|---|---|
| Number of observations | 58693 |
| Missing cells | 0 |
| Missing cells (%) | 0.0% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 10.7 MiB |
| Average record size in memory | 192.0 B |
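The memory figures above are consistent with every one of the 24 columns being stored as an 8-byte value. That storage width is an assumption (for example int64 or float64); the report itself does not list dtypes. A minimal arithmetic check under that assumption:

```python
# Assumption: each column stores 8 bytes per value (e.g. int64/float64).
N_ROWS = 58693           # "Number of observations"
N_COLS = 24              # "Number of variables"
BYTES_PER_VALUE = 8      # assumed, not stated in the report

record_bytes = N_COLS * BYTES_PER_VALUE        # bytes per row
total_mib = N_ROWS * record_bytes / 1024**2    # whole table, in MiB
column_kib = N_ROWS * BYTES_PER_VALUE / 1024   # one column, in KiB

print(record_bytes)          # 192
print(round(total_mib, 1))   # 10.7
print(round(column_kib, 1))  # 458.5
```

The last figure matches the "Memory size" of 458.5 KiB reported for each individual variable.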
Variable types
| NUM | 12 |
|---|---|
| BOOL | 9 |
| CAT | 3 |
Reproduction
| Analysis started | 2020-05-14 09:33:40.499686 |
|---|---|
| Analysis finished | 2020-05-14 09:34:36.359587 |
| Duration | 55.86 seconds |
| Version | pandas-profiling v2.8.0 |
| Command line | pandas_profiling --config_file config.yaml [YOUR_FILE.csv] |
| Download configuration | config.yaml |
| Brand has 44055 (75.1%) zeros | Zeros |
|---|---|
| Quantity has 44055 (75.1%) zeros | Zeros |
| Last_Inc_Brand has 44133 (75.2%) zeros | Zeros |
ID
Real number (ℝ≥0)
| Distinct count | 500 |
|---|---|
| Unique (%) | 0.9% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 200000252.89748353 |
|---|---|
| Minimum | 200000001 |
| Maximum | 200000500 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 458.5 KiB |
Quantile statistics
| Minimum | 200000001 |
|---|---|
| 5-th percentile | 200000029 |
| Q1 | 200000128 |
| median | 200000252 |
| Q3 | 200000378 |
| 95-th percentile | 200000476 |
| Maximum | 200000500 |
| Range | 499 |
| Interquartile range (IQR) | 250 |
Descriptive statistics
| Standard deviation | 144.3166768 |
|---|---|
| Coefficient of variation (CV) | 7.215824717e-07 |
| Kurtosis | -1.206182483 |
| Mean | 200000252.9 |
| Median Absolute Deviation (MAD) | 125 |
| Skewness | -0.00945628919 |
| Sum | 1.173861484e+13 |
| Variance | 20827.30321 |
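Several of the rows above are linked by simple identities: the variance is the squared standard deviation, the coefficient of variation is the standard deviation divided by the mean, and the range and IQR are differences of quantiles. The sketch below re-derives them from the figures reported for ID; the variable names are illustrative only.

```python
import math

# Figures copied from the ID tables above.
mean = 200000252.89748353
std = 144.3166768
q1, q3 = 200000128, 200000378
minimum, maximum = 200000001, 200000500

variance = std ** 2               # reported: 20827.30321
cv = std / mean                   # reported: 7.215824717e-07
iqr = q3 - q1                     # reported: 250
value_range = maximum - minimum   # reported: 499

# Compare with a relative tolerance, since the report rounds its output.
print(math.isclose(variance, 20827.30321, rel_tol=1e-6))
print(math.isclose(cv, 7.215824717e-07, rel_tol=1e-6))
print(iqr, value_range)
```

The very small CV reflects that ID is an identifier with a large offset rather than a measured quantity, so its spread is tiny relative to its mean.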
| Value | Count | Frequency (%) | |
| 200000187 | 358 | 0.6% | |
| 200000041 | 353 | 0.6% | |
| 200000247 | 347 | 0.6% | |
| 200000097 | 182 | 0.3% | |
| 200000297 | 179 | 0.3% | |
| 200000393 | 179 | 0.3% | |
| 200000345 | 178 | 0.3% | |
| 200000399 | 178 | 0.3% | |
| 200000064 | 175 | 0.3% | |
| 200000351 | 173 | 0.3% | |
| Other values (490) | 56391 | 96.1% |
| Value | Count | Frequency (%) | |
| 200000001 | 101 | 0.2% | |
| 200000002 | 87 | 0.1% | |
| 200000003 | 97 | 0.2% | |
| 200000004 | 85 | 0.1% | |
| 200000005 | 111 | 0.2% |
| Value | Count | Frequency (%) | |
| 200000500 | 124 | 0.2% | |
| 200000499 | 106 | 0.2% | |
| 200000498 | 131 | 0.2% | |
| 200000497 | 120 | 0.2% | |
| 200000496 | 120 | 0.2% |
Day
Real number (ℝ≥0)
| Distinct count | 730 |
|---|---|
| Unique (%) | 1.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 349.43107355221235 |
|---|---|
| Minimum | 1 |
| Maximum | 730 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 458.5 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 32 |
| Q1 | 161 |
| median | 343 |
| Q3 | 530 |
| 95-th percentile | 689 |
| Maximum | 730 |
| Range | 729 |
| Interquartile range (IQR) | 369 |
Descriptive statistics
| Standard deviation | 212.0450583 |
|---|---|
| Coefficient of variation (CV) | 0.6068294274 |
| Kurtosis | -1.216555375 |
| Mean | 349.4310736 |
| Median Absolute Deviation (MAD) | 184 |
| Skewness | 0.09270947784 |
| Sum | 20509158 |
| Variance | 44963.10673 |
| Value | Count | Frequency (%) | |
| 395 | 179 | 0.3% | |
| 51 | 173 | 0.3% | |
| 18 | 168 | 0.3% | |
| 25 | 165 | 0.3% | |
| 58 | 161 | 0.3% | |
| 11 | 161 | 0.3% | |
| 449 | 160 | 0.3% | |
| 70 | 159 | 0.3% | |
| 117 | 158 | 0.3% | |
| 44 | 157 | 0.3% | |
| Other values (720) | 57052 | 97.2% |
| Value | Count | Frequency (%) | |
| 1 | 113 | 0.2% | |
| 2 | 29 | < 0.1% | |
| 3 | 81 | 0.1% | |
| 4 | 136 | 0.2% | |
| 5 | 70 | 0.1% |
| Value | Count | Frequency (%) | |
| 730 | 57 | 0.1% | |
| 729 | 50 | 0.1% | |
| 728 | 32 | 0.1% | |
| 727 | 102 | 0.2% | |
| 726 | 99 | 0.2% |
Incidence
Boolean
| Distinct count | 2 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 458.5 KiB |
| Value | Count | Frequency (%) | |
| 0 | 44055 | 75.1% | |
| 1 | 14638 | 24.9% |
Brand
Real number (ℝ≥0)
| Distinct count | 6 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.8443085206072274 |
|---|---|
| Minimum | 0 |
| Maximum | 5 |
| Zeros | 44055 |
| Zeros (%) | 75.1% |
| Memory size | 458.5 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 0 |
| 95-th percentile | 5 |
| Maximum | 5 |
| Range | 5 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 1.633083365 |
|---|---|
| Coefficient of variation (CV) | 1.934225849 |
| Kurtosis | 1.368604017 |
| Mean | 0.8443085206 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 1.714161805 |
| Sum | 49555 |
| Variance | 2.666961277 |
| Value | Count | Frequency (%) | |
| 0 | 44055 | 75.1% | |
| 5 | 4978 | 8.5% | |
| 2 | 4542 | 7.7% | |
| 4 | 2927 | 5.0% | |
| 1 | 1350 | 2.3% | |
| 3 | 841 | 1.4% |
| Value | Count | Frequency (%) | |
| 0 | 44055 | 75.1% | |
| 1 | 1350 | 2.3% | |
| 2 | 4542 | 7.7% | |
| 3 | 841 | 1.4% | |
| 4 | 2927 | 5.0% |
| Value | Count | Frequency (%) | |
| 5 | 4978 | 8.5% | |
| 4 | 2927 | 5.0% | |
| 3 | 841 | 1.4% | |
| 2 | 4542 | 7.7% | |
| 1 | 1350 | 2.3% |
Quantity
Real number (ℝ≥0)
| Distinct count | 16 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.6919734891724737 |
|---|---|
| Minimum | 0 |
| Maximum | 15 |
| Zeros | 44055 |
| Zeros (%) | 75.1% |
| Memory size | 458.5 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 0 |
| 95-th percentile | 4 |
| Maximum | 15 |
| Range | 15 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 1.498734039 |
|---|---|
| Coefficient of variation (CV) | 2.16588361 |
| Kurtosis | 11.49524955 |
| Mean | 0.6919734892 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 2.944460607 |
| Sum | 40614 |
| Variance | 2.24620372 |
| Value | Count | Frequency (%) | |
| 0 | 44055 | 75.1% | |
| 3 | 4241 | 7.2% | |
| 2 | 3800 | 6.5% | |
| 1 | 3542 | 6.0% | |
| 4 | 1165 | 2.0% | |
| 5 | 921 | 1.6% | |
| 6 | 355 | 0.6% | |
| 7 | 200 | 0.3% | |
| 8 | 133 | 0.2% | |
| 9 | 106 | 0.2% | |
| Other values (6) | 175 | 0.3% |
| Value | Count | Frequency (%) | |
| 0 | 44055 | 75.1% | |
| 1 | 3542 | 6.0% | |
| 2 | 3800 | 6.5% | |
| 3 | 4241 | 7.2% | |
| 4 | 1165 | 2.0% |
| Value | Count | Frequency (%) | |
| 15 | 2 | < 0.1% | |
| 14 | 5 | < 0.1% | |
| 13 | 20 | < 0.1% | |
| 12 | 29 | < 0.1% | |
| 11 | 38 | 0.1% |
Last_Inc_Brand
Real number (ℝ≥0)
| Distinct count | 6 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.8407987323871671 |
|---|---|
| Minimum | 0 |
| Maximum | 5 |
| Zeros | 44133 |
| Zeros (%) | 75.2% |
| Memory size | 458.5 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 0 |
| 95-th percentile | 5 |
| Maximum | 5 |
| Range | 5 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 1.63162799 |
|---|---|
| Coefficient of variation (CV) | 1.940569041 |
| Kurtosis | 1.391106065 |
| Mean | 0.8407987324 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 1.721146777 |
| Sum | 49349 |
| Variance | 2.662209898 |
| Value | Count | Frequency (%) | |
| 0 | 44133 | 75.2% | |
| 5 | 4972 | 8.5% | |
| 2 | 4491 | 7.7% | |
| 4 | 2914 | 5.0% | |
| 1 | 1349 | 2.3% | |
| 3 | 834 | 1.4% |
| Value | Count | Frequency (%) | |
| 0 | 44133 | 75.2% | |
| 1 | 1349 | 2.3% | |
| 2 | 4491 | 7.7% | |
| 3 | 834 | 1.4% | |
| 4 | 2914 | 5.0% |
| Value | Count | Frequency (%) | |
| 5 | 4972 | 8.5% | |
| 4 | 2914 | 5.0% | |
| 3 | 834 | 1.4% | |
| 2 | 4491 | 7.7% | |
| 1 | 1349 | 2.3% |
Last_Inc_Quantity
Boolean
| Distinct count | 2 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 458.5 KiB |
| Value | Count | Frequency (%) | |
| 0 | 44133 | 75.2% | |
| 1 | 14560 | 24.8% |
Price_1
Real number (ℝ≥0)
| Distinct count | 37 |
|---|---|
| Unique (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1.392074352989283 |
|---|---|
| Minimum | 1.1 |
| Maximum | 1.59 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 458.5 KiB |
Quantile statistics
| Minimum | 1.1 |
|---|---|
| 5-th percentile | 1.21 |
| Q1 | 1.34 |
| median | 1.39 |
| Q3 | 1.47 |
| 95-th percentile | 1.5 |
| Maximum | 1.59 |
| Range | 0.49 |
| Interquartile range (IQR) | 0.13 |
Descriptive statistics
| Standard deviation | 0.09113873559 |
|---|---|
| Coefficient of variation (CV) | 0.06546973256 |
| Kurtosis | -0.1785760158 |
| Mean | 1.392074353 |
| Median Absolute Deviation (MAD) | 0.07 |
| Skewness | -0.5556097282 |
| Sum | 81705.02 |
| Variance | 0.008306269126 |
| Value | Count | Frequency (%) | |
| 1.47 | 7131 | 12.1% | |
| 1.39 | 6091 | 10.4% | |
| 1.35 | 6027 | 10.3% | |
| 1.5 | 4091 | 7.0% | |
| 1.33 | 3393 | 5.8% | |
| 1.34 | 3186 | 5.4% | |
| 1.48 | 3015 | 5.1% | |
| 1.49 | 2664 | 4.5% | |
| 1.46 | 2471 | 4.2% | |
| 1.37 | 2421 | 4.1% | |
| Other values (27) | 18203 | 31.0% |
| Value | Count | Frequency (%) | |
| 1.1 | 158 | 0.3% | |
| 1.14 | 327 | 0.6% | |
| 1.17 | 116 | 0.2% | |
| 1.19 | 1102 | 1.9% | |
| 1.2 | 135 | 0.2% |
| Value | Count | Frequency (%) | |
| 1.59 | 568 | 1.0% | |
| 1.52 | 678 | 1.2% | |
| 1.51 | 1354 | 2.3% | |
| 1.5 | 4091 | 7.0% | |
| 1.49 | 2664 | 4.5% |
Price_2
Real number (ℝ≥0)
| Distinct count | 30 |
|---|---|
| Unique (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1.7809989266181656 |
|---|---|
| Minimum | 1.26 |
| Maximum | 1.9 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 458.5 KiB |
Quantile statistics
| Minimum | 1.26 |
|---|---|
| 5-th percentile | 1.48 |
| Q1 | 1.58 |
| median | 1.88 |
| Q3 | 1.89 |
| 95-th percentile | 1.9 |
| Maximum | 1.9 |
| Range | 0.64 |
| Interquartile range (IQR) | 0.31 |
Descriptive statistics
| Standard deviation | 0.1708676874 |
|---|---|
| Coefficient of variation (CV) | 0.09593924221 |
| Kurtosis | 0.6467319018 |
| Mean | 1.780998927 |
| Median Absolute Deviation (MAD) | 0.02 |
| Skewness | -1.382151978 |
| Sum | 104532.17 |
| Variance | 0.02919576659 |
| Value | Count | Frequency (%) | |
| 1.89 | 20033 | 34.1% | |
| 1.9 | 7660 | 13.1% | |
| 1.57 | 4921 | 8.4% | |
| 1.88 | 3361 | 5.7% | |
| 1.87 | 2997 | 5.1% | |
| 1.85 | 2680 | 4.6% | |
| 1.51 | 1411 | 2.4% | |
| 1.58 | 1298 | 2.2% | |
| 1.56 | 1219 | 2.1% | |
| 1.86 | 1113 | 1.9% | |
| Other values (20) | 12000 | 20.4% |
| Value | Count | Frequency (%) | |
| 1.26 | 875 | 1.5% | |
| 1.27 | 120 | 0.2% | |
| 1.31 | 200 | 0.3% | |
| 1.35 | 1089 | 1.9% | |
| 1.36 | 572 | 1.0% |
| Value | Count | Frequency (%) | |
| 1.9 | 7660 | 13.1% | |
| 1.89 | 20033 | 34.1% | |
| 1.88 | 3361 | 5.7% | |
| 1.87 | 2997 | 5.1% | |
| 1.86 | 1113 | 1.9% |
Price_3
Real number (ℝ≥0)
| Distinct count | 21 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 2.0067887141567136 |
|---|---|
| Minimum | 1.87 |
| Maximum | 2.14 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 458.5 KiB |
Quantile statistics
| Minimum | 1.87 |
|---|---|
| 5-th percentile | 1.93 |
| Q1 | 1.97 |
| median | 2.01 |
| Q3 | 2.06 |
| 95-th percentile | 2.07 |
| Maximum | 2.14 |
| Range | 0.27 |
| Interquartile range (IQR) | 0.09 |
Descriptive statistics
| Standard deviation | 0.04686722504 |
|---|---|
| Coefficient of variation (CV) | 0.02335433955 |
| Kurtosis | -0.2863220376 |
| Mean | 2.006788714 |
| Median Absolute Deviation (MAD) | 0.04 |
| Skewness | -0.05086428685 |
| Sum | 117784.45 |
| Variance | 0.002196536783 |
| Value | Count | Frequency (%) | |
| 1.99 | 11053 | 18.8% | |
| 2.02 | 9563 | 16.3% | |
| 2.06 | 8148 | 13.9% | |
| 2.07 | 4962 | 8.5% | |
| 1.97 | 4580 | 7.8% | |
| 2.01 | 4512 | 7.7% | |
| 1.95 | 3860 | 6.6% | |
| 2 | 2305 | 3.9% | |
| 1.94 | 1919 | 3.3% | |
| 1.91 | 1664 | 2.8% | |
| Other values (11) | 6127 | 10.4% |
| Value | Count | Frequency (%) | |
| 1.87 | 133 | 0.2% | |
| 1.89 | 501 | 0.9% | |
| 1.91 | 1664 | 2.8% | |
| 1.93 | 1007 | 1.7% | |
| 1.94 | 1919 | 3.3% |
| Value | Count | Frequency (%) | |
| 2.14 | 271 | 0.5% | |
| 2.13 | 52 | 0.1% | |
| 2.11 | 458 | 0.8% | |
| 2.09 | 1206 | 2.1% | |
| 2.07 | 4962 | 8.5% |
Price_4
Real number (ℝ≥0)
| Distinct count | 26 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 2.1599453086398714 |
|---|---|
| Minimum | 1.76 |
| Maximum | 2.26 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 458.5 KiB |
Quantile statistics
| Minimum | 1.76 |
|---|---|
| 5-th percentile | 1.97 |
| Q1 | 2.12 |
| median | 2.17 |
| Q3 | 2.24 |
| 95-th percentile | 2.26 |
| Maximum | 2.26 |
| Range | 0.5 |
| Interquartile range (IQR) | 0.12 |
Descriptive statistics
| Standard deviation | 0.08982459456 |
|---|---|
| Coefficient of variation (CV) | 0.04158651342 |
| Kurtosis | 1.41483285 |
| Mean | 2.159945309 |
| Median Absolute Deviation (MAD) | 0.05 |
| Skewness | -1.331337362 |
| Sum | 126773.67 |
| Variance | 0.008068457787 |
| Value | Count | Frequency (%) | |
| 2.21 | 10899 | 18.6% | |
| 2.24 | 10872 | 18.5% | |
| 2.16 | 10830 | 18.5% | |
| 2.09 | 4891 | 8.3% | |
| 2.26 | 4655 | 7.9% | |
| 2.12 | 4481 | 7.6% | |
| 1.97 | 2198 | 3.7% | |
| 2.18 | 1869 | 3.2% | |
| 1.9 | 987 | 1.7% | |
| 2.15 | 736 | 1.3% | |
| Other values (16) | 6275 | 10.7% |
| Value | Count | Frequency (%) | |
| 1.76 | 77 | 0.1% | |
| 1.89 | 563 | 1.0% | |
| 1.9 | 987 | 1.7% | |
| 1.94 | 578 | 1.0% | |
| 1.96 | 602 | 1.0% |
| Value | Count | Frequency (%) | |
| 2.26 | 4655 | 7.9% | |
| 2.24 | 10872 | 18.5% | |
| 2.21 | 10899 | 18.6% | |
| 2.2 | 481 | 0.8% | |
| 2.19 | 507 | 0.9% |
Price_5
Real number (ℝ≥0)
| Distinct count | 44 |
|---|---|
| Unique (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 2.65479767604314 |
|---|---|
| Minimum | 2.11 |
| Maximum | 2.8 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 458.5 KiB |
Quantile statistics
| Minimum | 2.11 |
|---|---|
| 5-th percentile | 2.44 |
| Q1 | 2.63 |
| median | 2.67 |
| Q3 | 2.7 |
| 95-th percentile | 2.79 |
| Maximum | 2.8 |
| Range | 0.69 |
| Interquartile range (IQR) | 0.07 |
Descriptive statistics
| Standard deviation | 0.09827182872 |
|---|---|
| Coefficient of variation (CV) | 0.03701669231 |
| Kurtosis | 3.907127941 |
| Mean | 2.654797676 |
| Median Absolute Deviation (MAD) | 0.03 |
| Skewness | -1.52909921 |
| Sum | 155818.04 |
| Variance | 0.009657352321 |
| Value | Count | Frequency (%) | |
| 2.67 | 11794 | 20.1% | |
| 2.7 | 5582 | 9.5% | |
| 2.66 | 5207 | 8.9% | |
| 2.79 | 4347 | 7.4% | |
| 2.64 | 3615 | 6.2% | |
| 2.69 | 2491 | 4.2% | |
| 2.62 | 2334 | 4.0% | |
| 2.63 | 2217 | 3.8% | |
| 2.77 | 1698 | 2.9% | |
| 2.49 | 1677 | 2.9% | |
| Other values (34) | 17731 | 30.2% |
| Value | Count | Frequency (%) | |
| 2.11 | 114 | 0.2% | |
| 2.19 | 80 | 0.1% | |
| 2.27 | 174 | 0.3% | |
| 2.29 | 56 | 0.1% | |
| 2.34 | 461 | 0.8% |
| Value | Count | Frequency (%) | |
| 2.8 | 1157 | 2.0% | |
| 2.79 | 4347 | 7.4% | |
| 2.78 | 868 | 1.5% | |
| 2.77 | 1698 | 2.9% | |
| 2.76 | 738 | 1.3% |
Promotion_1
Boolean
| Distinct count | 2 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 458.5 KiB |
| Value | Count | Frequency (%) | |
| 0 | 38512 | 65.6% | |
| 1 | 20181 | 34.4% |
Promotion_2
Boolean
| Distinct count | 2 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 458.5 KiB |
| Value | Count | Frequency (%) | |
| 0 | 40169 | 68.4% | |
| 1 | 18524 | 31.6% |
Promotion_3
Boolean
| Distinct count | 2 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 458.5 KiB |
| Value | Count | Frequency (%) | |
| 0 | 56181 | 95.7% | |
| 1 | 2512 | 4.3% |
Promotion_4
Boolean
| Distinct count | 2 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 458.5 KiB |
| Value | Count | Frequency (%) | |
| 0 | 51776 | 88.2% | |
| 1 | 6917 | 11.8% |
Promotion_5
Boolean
| Distinct count | 2 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 458.5 KiB |
| Value | Count | Frequency (%) | |
| 0 | 56588 | 96.4% | |
| 1 | 2105 | 3.6% |
Sex
Boolean
| Distinct count | 2 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 458.5 KiB |
| Value | Count | Frequency (%) | |
| 0 | 36044 | 61.4% | |
| 1 | 22649 | 38.6% |
Marital status
Boolean
| Distinct count | 2 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 458.5 KiB |
| Value | Count | Frequency (%) | |
| 0 | 35620 | 60.7% | |
| 1 | 23073 | 39.3% |
Age
Real number (ℝ≥0)
| Distinct count | 56 |
|---|---|
| Unique (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 38.79396180123695 |
|---|---|
| Minimum | 18 |
| Maximum | 75 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 458.5 KiB |
Quantile statistics
| Minimum | 18 |
|---|---|
| 5-th percentile | 24 |
| Q1 | 30 |
| median | 36 |
| Q3 | 46 |
| 95-th percentile | 63 |
| Maximum | 75 |
| Range | 57 |
| Interquartile range (IQR) | 16 |
Descriptive statistics
| Standard deviation | 12.05244748 |
|---|---|
| Coefficient of variation (CV) | 0.3106784385 |
| Kurtosis | -0.2225011979 |
| Mean | 38.7939618 |
| Median Absolute Deviation (MAD) | 8 |
| Skewness | 0.7231417785 |
| Sum | 2276934 |
| Variance | 145.2614902 |
| Value | Count | Frequency (%) | |
| 35 | 2859 | 4.9% | |
| 27 | 2859 | 4.9% | |
| 31 | 2759 | 4.7% | |
| 32 | 2487 | 4.2% | |
| 25 | 2436 | 4.2% | |
| 26 | 2403 | 4.1% | |
| 40 | 2265 | 3.9% | |
| 37 | 2155 | 3.7% | |
| 36 | 2011 | 3.4% | |
| 33 | 1931 | 3.3% | |
| Other values (46) | 34528 | 58.8% |
| Value | Count | Frequency (%) | |
| 18 | 235 | 0.4% | |
| 19 | 106 | 0.2% | |
| 20 | 196 | 0.3% | |
| 21 | 467 | 0.8% | |
| 22 | 319 | 0.5% |
| Value | Count | Frequency (%) | |
| 75 | 72 | 0.1% | |
| 74 | 94 | 0.2% | |
| 73 | 121 | 0.2% | |
| 71 | 101 | 0.2% | |
| 70 | 92 | 0.2% |
Education
Categorical
| Distinct count | 4 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 458.5 KiB |
| Value | Count | Frequency (%) | |
| 1 | 37161 | 63.3% | |
| 2 | 11716 | 20.0% | |
| 0 | 8462 | 14.4% | |
| 3 | 1354 | 2.3% |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Income
Real number (ℝ≥0)
| Distinct count | 499 |
|---|---|
| Unique (%) | 0.9% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 121841.64431874329 |
|---|---|
| Minimum | 38247 |
| Maximum | 309364 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 458.5 KiB |
Quantile statistics
| Minimum | 38247 |
|---|---|
| 5-th percentile | 68834 |
| Q1 | 95541 |
| median | 117971 |
| Q3 | 138525 |
| 95-th percentile | 194882 |
| Maximum | 309364 |
| Range | 271117 |
| Interquartile range (IQR) | 42984 |
Descriptive statistics
| Standard deviation | 40643.74068 |
|---|---|
| Coefficient of variation (CV) | 0.3335783993 |
| Kurtosis | 3.868321723 |
| Mean | 121841.6443 |
| Median Absolute Deviation (MAD) | 21506 |
| Skewness | 1.424871282 |
| Sum | 7151251630 |
| Variance | 1651913656 |
| Value | Count | Frequency (%) | |
| 124597 | 358 | 0.6% | |
| 106205 | 353 | 0.6% | |
| 158193 | 347 | 0.6% | |
| 69487 | 228 | 0.4% | |
| 135275 | 182 | 0.3% | |
| 123003 | 179 | 0.3% | |
| 113012 | 179 | 0.3% | |
| 147626 | 178 | 0.3% | |
| 95438 | 178 | 0.3% | |
| 81200 | 175 | 0.3% | |
| Other values (489) | 56336 | 96.0% |
| Value | Count | Frequency (%) | |
| 38247 | 114 | 0.2% | |
| 43684 | 129 | 0.2% | |
| 43805 | 98 | 0.2% | |
| 53608 | 93 | 0.2% | |
| 57480 | 118 | 0.2% |
| Value | Count | Frequency (%) | |
| 309364 | 105 | 0.2% | |
| 308529 | 109 | 0.2% | |
| 308491 | 125 | 0.2% | |
| 281923 | 135 | 0.2% | |
| 281084 | 123 | 0.2% |
Occupation
Categorical
| Distinct count | 3 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 458.5 KiB |
| Value | Count | Frequency (%) | |
| 1 | 29882 | 50.9% | |
| 0 | 21032 | 35.8% | |
| 2 | 7779 | 13.3% |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Settlement size
Categorical
| Distinct count | 3 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 458.5 KiB |
| Value | Count | Frequency (%) | |
| 0 | 32081 | 54.7% | |
| 1 | 14727 | 25.1% | |
| 2 | 11885 | 20.2% |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Pearson's r
Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r. To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
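As a minimal illustration of the definition just given, the sketch below computes r in plain Python as covariance divided by the product of standard deviations. The function name `pearson_r` and the sample data are made up for this example and are not taken from the report or from pandas-profiling.

```python
import math

def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    # The 1/n (or 1/(n-1)) factors cancel between numerator and denominator.
    return cov / math.sqrt(var_x * var_y)

# Hypothetical data: roughly linear with a little noise.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.1, 5.9, 8.2, 9.8]

r = pearson_r(x, y)
# Shifting x and rescaling it by a positive factor leaves r unchanged.
r_shifted = pearson_r([10 * a + 3 for a in x], y)
print(r, r_shifted)
```

Note that the invariance holds for positive scale factors; multiplying one variable by a negative number flips the sign of r.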
Spearman's ρ
Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation. To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
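The rank-based definition can be sketched in plain Python as follows: convert each variable to ranks (ties receive the average of their positions, a common convention assumed here) and apply the Pearson formula to the ranks. All function names and data are illustrative.

```python
import math

def average_ranks(values):
    """Rank values 1..n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson_r(x, y):
    """Covariance divided by the product of standard deviations."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def spearman_rho(x, y):
    """Pearson's r applied to the rank variables of x and y."""
    return pearson_r(average_ranks(x), average_ranks(y))

# Hypothetical data: y = x**3 is monotonic but not linear.
x = [1, 2, 3, 4, 5]
y = [1, 8, 27, 64, 125]
rho = spearman_rho(x, y)
r = pearson_r(x, y)
print(rho, r)
```

For this data ρ reaches its maximum because the relationship is perfectly monotonic, while r stays below it because the relationship is not linear.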
Kendall's τ
Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation. To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
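The pair-counting description corresponds to the variant usually called tau-a. The sketch below implements exactly that formula in plain Python; statistical libraries often apply a tie correction (tau-b) instead, so results can differ when the data contain ties. The function name and data are illustrative.

```python
from itertools import combinations

def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs, with no tie correction."""
    concordant = discordant = 0
    pairs = list(combinations(range(len(x)), 2))
    for i, j in pairs:
        s = (x[i] - x[j]) * (y[i] - y[j])
        if s > 0:
            concordant += 1   # both variables move in the same direction
        elif s < 0:
            discordant += 1   # the variables move in opposite directions
        # s == 0 is a tie: it counts toward the total only.
    return (concordant - discordant) / len(pairs)

# Hypothetical data: one swapped pair out of six.
x = [1, 2, 3, 4]
y = [1, 3, 2, 4]
tau = kendall_tau_a(x, y)
print(tau)
```

Here five pairs are concordant and one is discordant, giving (5 - 1) / 6. The double loop is quadratic in the number of observations, which is fine for a sketch but slow for a table of this size.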
Phik (φk)
Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. There is extensive documentation available here.
Cramér's V (φc)
Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been proved to be biased, even for large samples. We use a bias-corrected measure that has been proposed by Bergsma in 2013 that can be found here.
First rows
| | ID | Day | Incidence | Brand | Quantity | Last_Inc_Brand | Last_Inc_Quantity | Price_1 | Price_2 | Price_3 | Price_4 | Price_5 | Promotion_1 | Promotion_2 | Promotion_3 | Promotion_4 | Promotion_5 | Sex | Marital status | Age | Education | Income | Occupation | Settlement size |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 200000001 | 1 | 0 | 0 | 0 | 0 | 0 | 1.59 | 1.87 | 2.01 | 2.09 | 2.66 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 47 | 1 | 110866 | 1 | 0 |
| 1 | 200000001 | 11 | 0 | 0 | 0 | 0 | 0 | 1.51 | 1.89 | 1.99 | 2.09 | 2.66 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 47 | 1 | 110866 | 1 | 0 |
| 2 | 200000001 | 12 | 0 | 0 | 0 | 0 | 0 | 1.51 | 1.89 | 1.99 | 2.09 | 2.66 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 47 | 1 | 110866 | 1 | 0 |
| 3 | 200000001 | 16 | 0 | 0 | 0 | 0 | 0 | 1.52 | 1.89 | 1.98 | 2.09 | 2.66 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 47 | 1 | 110866 | 1 | 0 |
| 4 | 200000001 | 18 | 0 | 0 | 0 | 0 | 0 | 1.52 | 1.89 | 1.99 | 2.09 | 2.66 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 47 | 1 | 110866 | 1 | 0 |
| 5 | 200000001 | 23 | 0 | 0 | 0 | 0 | 0 | 1.50 | 1.90 | 1.99 | 2.09 | 2.66 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 47 | 1 | 110866 | 1 | 0 |
| 6 | 200000001 | 28 | 1 | 2 | 2 | 0 | 0 | 1.50 | 1.90 | 1.99 | 2.09 | 2.67 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 47 | 1 | 110866 | 1 | 0 |
| 7 | 200000001 | 37 | 0 | 0 | 0 | 2 | 1 | 1.50 | 1.90 | 1.99 | 2.09 | 2.67 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 47 | 1 | 110866 | 1 | 0 |
| 8 | 200000001 | 41 | 0 | 0 | 0 | 0 | 0 | 1.35 | 1.58 | 1.97 | 2.09 | 2.67 | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 47 | 1 | 110866 | 1 | 0 |
| 9 | 200000001 | 43 | 0 | 0 | 0 | 0 | 0 | 1.35 | 1.58 | 1.97 | 2.09 | 2.67 | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 47 | 1 | 110866 | 1 | 0 |
Last rows
| | ID | Day | Incidence | Brand | Quantity | Last_Inc_Brand | Last_Inc_Quantity | Price_1 | Price_2 | Price_3 | Price_4 | Price_5 | Promotion_1 | Promotion_2 | Promotion_3 | Promotion_4 | Promotion_5 | Sex | Marital status | Age | Education | Income | Occupation | Settlement size |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 58683 | 200000500 | 681 | 0 | 0 | 0 | 0 | 0 | 1.42 | 1.85 | 2.06 | 2.24 | 2.77 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 42 | 1 | 120946 | 1 | 0 |
| 58684 | 200000500 | 689 | 0 | 0 | 0 | 0 | 0 | 1.50 | 1.87 | 2.06 | 2.24 | 2.78 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 42 | 1 | 120946 | 1 | 0 |
| 58685 | 200000500 | 693 | 0 | 0 | 0 | 0 | 0 | 1.42 | 1.51 | 2.02 | 2.24 | 2.77 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 42 | 1 | 120946 | 1 | 0 |
| 58686 | 200000500 | 694 | 0 | 0 | 0 | 0 | 0 | 1.42 | 1.51 | 2.02 | 2.24 | 2.77 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 42 | 1 | 120946 | 1 | 0 |
| 58687 | 200000500 | 697 | 1 | 2 | 6 | 0 | 0 | 1.42 | 1.51 | 1.97 | 2.24 | 2.78 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 42 | 1 | 120946 | 1 | 0 |
| 58688 | 200000500 | 703 | 0 | 0 | 0 | 2 | 1 | 1.41 | 1.85 | 2.01 | 2.24 | 2.79 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 42 | 1 | 120946 | 1 | 0 |
| 58689 | 200000500 | 710 | 0 | 0 | 0 | 0 | 0 | 1.36 | 1.84 | 2.09 | 2.24 | 2.77 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 42 | 1 | 120946 | 1 | 0 |
| 58690 | 200000500 | 717 | 0 | 0 | 0 | 0 | 0 | 1.50 | 1.80 | 2.14 | 2.24 | 2.75 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 42 | 1 | 120946 | 1 | 0 |
| 58691 | 200000500 | 722 | 1 | 2 | 3 | 0 | 0 | 1.51 | 1.82 | 2.09 | 2.24 | 2.80 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 42 | 1 | 120946 | 1 | 0 |
| 58692 | 200000500 | 726 | 0 | 0 | 0 | 2 | 1 | 1.51 | 1.82 | 2.09 | 2.24 | 2.80 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 42 | 1 | 120946 | 1 | 0 |